======================================================================
USING NVCC 2.3
======================================================================
ptxas info    : Compiling entry function '_Z28kernel_bspline_mse_2_mysidiaPfS_S_S_PiS0_4int4fi'
ptxas info    : Used 25 registers, 64+16 bytes smem, 12 bytes cmem[1]
ptxas info    : Compiling entry function '_Z24kernel_row_to_tile_majorPfS_4int3S0_'
ptxas info    : Used 18 registers, 32+16 bytes smem, 12 bytes cmem[1]
ptxas info    : Compiling entry function '_Z26kernel_bspline_mse_2_hyperPfS_S_S_S_S_4int4PiS1_fi'
ptxas info    : Used 13 registers, 64+16 bytes smem, 36 bytes cmem[1]
ptxas info    : Compiling entry function '_Z22kernel_deinterleave_XPiPfS_S_S_'
ptxas info    : Used 4 registers, 32+16 bytes smem, 24 bytes cmem[1]
ptxas info    : Compiling entry function '_Z19kernel_deinterleaveiPfS_S_S_'
ptxas info    : Used 4 registers, 32+16 bytes smem, 12 bytes cmem[1]
ptxas info    : Compiling entry function '_Z37bspline_cuda_compute_grad_norm_kernelPfS_i'
ptxas info    : Used 7 registers, 16+16 bytes smem, 8 bytes cmem[1]
ptxas info    : Compiling entry function '_Z37bspline_cuda_compute_grad_mean_kernelPfS_i'
ptxas info    : Used 7 registers, 16+16 bytes smem, 8 bytes cmem[1]
ptxas info    : Compiling entry function '_Z31bspline_cuda_update_grad_kernelPfii'
ptxas info    : Used 7 registers, 16+16 bytes smem, 12 bytes cmem[1]
ptxas info    : Compiling entry function '_Z30kernel_sum_reduction_last_stepPfS_i'
ptxas info    : Used 7 registers, 16+16 bytes smem, 4 bytes cmem[1]
ptxas info    : Compiling entry function '_Z20kernel_sum_reductionPfS_i'
ptxas info    : Used 7 registers, 16+16 bytes smem, 8 bytes cmem[1]
ptxas info    : Compiling entry function '_Z20kernel_bspline_mse_2PfS_i4int3S0_S0_'
ptxas info    : Used 44 registers, 48+16 bytes smem, 36 bytes cmem[1]
ptxas info    : Compiling entry function '_Z20kernel_bspline_mse_1PfS_S_S_S_S_4int36float3S1_S1_S0_S0_S0_S1_S0_S0_'
ptxas info    : Used 27 registers, 144+16 bytes smem, 44 bytes cmem[1]
ptxas info    : Compiling entry function '_Z15kernel_gradientPf4dim3S0_jf'
ptxas info    : Used 13 registers, 48+16 bytes smem, 12 bytes cmem[1]


======================================================================
USING NVCC 2.0
======================================================================
ptxas info    : Compiling entry function '__globfunc__Z28kernel_bspline_mse_2_mysidiaPfS_S_S_PiS0_4int4fi'
ptxas info    : Used 25 registers, 80+80 bytes smem, 12 bytes cmem[1]
ptxas info    : Compiling entry function '__globfunc__Z24kernel_row_to_tile_majorPfS_4int3S0_'
ptxas info    : Used 18 registers, 48+48 bytes smem, 12 bytes cmem[1]
ptxas info    : Compiling entry function '__globfunc__Z26kernel_bspline_mse_2_hyperPfS_S_S_S_S_4int4PiS1_fi'
ptxas info    : Used 12 registers, 80+80 bytes smem, 32 bytes cmem[1]
ptxas info    : Compiling entry function '__globfunc__Z22kernel_deinterleave_XPiPfS_S_S_'
ptxas info    : Used 4 registers, 48+48 bytes smem, 24 bytes cmem[1]
ptxas info    : Compiling entry function '__globfunc__Z19kernel_deinterleaveiPfS_S_S_'
ptxas info    : Used 4 registers, 48+48 bytes smem, 12 bytes cmem[1]
ptxas info    : Compiling entry function '__globfunc__Z37bspline_cuda_compute_grad_norm_kernelPfS_i'
ptxas info    : Used 7 registers, 32+32 bytes smem, 8 bytes cmem[1]
ptxas info    : Compiling entry function '__globfunc__Z37bspline_cuda_compute_grad_mean_kernelPfS_i'
ptxas info    : Used 7 registers, 32+32 bytes smem, 8 bytes cmem[1]
ptxas info    : Compiling entry function '__globfunc__Z31bspline_cuda_update_grad_kernelPfii'
ptxas info    : Used 7 registers, 32+32 bytes smem, 12 bytes cmem[1]
ptxas info    : Compiling entry function '__globfunc__Z30kernel_sum_reduction_last_stepPfS_i'
ptxas info    : Used 7 registers, 32+32 bytes smem, 4 bytes cmem[1]
ptxas info    : Compiling entry function '__globfunc__Z20kernel_sum_reductionPfS_i'
ptxas info    : Used 7 registers, 32+32 bytes smem, 8 bytes cmem[1]
ptxas info    : Compiling entry function '__globfunc__Z20kernel_bspline_mse_2PfS_i4int3S0_S0_'
ptxas info    : Used 38 registers, 64+64 bytes smem, 32 bytes cmem[1]
ptxas info    : Compiling entry function '__globfunc__Z20kernel_bspline_mse_1PfS_S_S_S_S_4int36float3S1_S1_S0_S0_S0_S1_S0_S0_'
ptxas info    : Used 30 registers, 160+160 bytes smem, 52 bytes cmem[1]
ptxas info    : Compiling entry function '__globfunc__Z15kernel_gradientPf4dim3S0_jfS_'
ptxas info    : Used 13 registers, 64+64 bytes smem, 12 bytes cmem[1]

